Fast String Searching

Authors

  • Andrew Hume
  • Daniel Sunday
Abstract

Since the Boyer-Moore algorithm was described in 1977, it has been the standard benchmark for the practical string search literature. Yet this yardstick compares badly with current practice. We describe two algorithms that perform 47% fewer comparisons and are about 4.5 times faster across a wide range of architectures and compilers. These new variants are members of a family of algorithms based on the skip loop structure of the preferred, but often neglected, fast form of Boyer-Moore. We present a taxonomy for this family, and describe a toolkit of components that can be used to design an algorithm most appropriate for a given set of requirements.
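As a rough illustration of the skip loop structure the abstract refers to, the following is a minimal sketch in Python. It is not one of the paper's two algorithms. It assumes a Horspool-style skip table in which the pattern's last character has skip 0, followed by a plain substring comparison and a fixed shift. The names skip_loop_search, skip and shift are hypothetical and chosen only for this example.

```python
def skip_loop_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text.

    Illustrative sketch only: a simplified skip-loop search in the
    Boyer-Moore family, not the tuned algorithms from the paper.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    last = pattern[-1]

    # skip[c] = distance from the rightmost occurrence of c in the pattern
    # to the pattern's end. The last character therefore gets 0, which is
    # what terminates the skip loop. Characters absent from the pattern
    # are treated as skip = m via dict.get below.
    skip = {c: m - 1 - i for i, c in enumerate(pattern)}

    # shift = distance from the last character back to its previous
    # occurrence in the pattern, or m if it occurs nowhere else.
    shift = m
    for i in range(m - 2, -1, -1):
        if pattern[i] == last:
            shift = m - 1 - i
            break

    hits = []
    k = m - 1  # text index aligned with the pattern's last character
    while k < n:
        # Skip loop: advance until text[k] equals the pattern's last character.
        step = skip.get(text[k], m)
        while step:
            k += step
            if k >= n:
                return hits
            step = skip.get(text[k], m)

        # Match step: text[k] == last, so compare the whole window.
        start = k - m + 1
        if text[start:k + 1] == pattern:
            hits.append(start)

        # Shift step: move to the next alignment that could still match.
        k += shift
    return hits


if __name__ == "__main__":
    print(skip_loop_search("abracadabra", "abra"))
```

The three phases (skip, match, shift) are kept separate here so that each can be swapped for a different component; the bounds check inside the skip loop stands in for the sentinel technique that a C implementation might use instead.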

Similar resources

String Matching Techniques for Searching: Algorithms and Applications Term Paper -spring 2000

Searching for information is an important aspect of computer science, and various environments, including the fast-growing Internet, require applications with effective searching techniques. We look into several aspects of searching and introduce some string matching techniques and algorithms that are useful for various searching needs.


Fast Relative Lempel-Ziv Self-index for Similar Sequences

Recent advances in biotechnology and web technology are generating huge collections of similar strings. People now face the problem of storing them compactly while supporting fast pattern searching. One compression scheme called relative Lempel-Ziv compression uses textual substitutions from a reference text as follows: Given a (large) set S of strings, represent each string in S as a concatena...


Fast String Matching with Mismatches

We describe and analyze three simple and fast algorithms on the average for solving the problem of string matching with a bounded number of mismatches. These are the naive algorithm, an algorithm based on the Boyer-Moore approach, and ad-hoc deterministic finite automata searching. We include simulation results that compare these algorithms to previous works.


Fast approximate string matching with finite automata

We present a fast algorithm for finding approximate matches of a string in a finite-state automaton, given some metric of similarity. The algorithm can be adapted to use a variety of metrics for determining the distance between two words.


Fast exact string matching algorithms

String matching is the problem of finding all the occurrences of a pattern in a text. We propose a very fast new family of string matching algorithms based on hashing q-grams. The new algorithms are the fastest in many cases, in particular on small alphabets. © 2007 Elsevier B.V. All rights reserved.


Fast Filters for Two Dimensional String Matching Allowing Rotations

We give faster algorithms for searching a 2-dimensional pattern in a 2-dimensional text allowing rotations, mismatches and/or insertion/deletion errors.


Journal:
  • Softw., Pract. Exper.

Volume 21, Issue 

Pages  -

Publication date: 1991